Cone-beam computed tomography (CBCT) provides 3D volumetric imaging of a target at a lower radiation dose and cost than conventional computed tomography, and it is widely used in the detection of paranasal sinus disease. However, it lacks the sensitivity to detect soft-tissue lesions owing to reconstruction constraints. Consequently, only physicians with expertise in CBCT reading can distinguish between inherent artifacts or noise and disease, restricting the use of this imaging modality. The development of artificial intelligence (AI)-based computer-aided diagnosis methods for CBCT to overcome the shortage of experienced physicians has attracted substantial attention. However, no advanced AI-based diagnosis addressing the intrinsic noise in CBCT has been devised, discouraging the practical use of AI solutions for CBCT. To address this issue, we propose an AI-based computer-aided diagnosis method using CBCT with a denoising module. This module is applied before diagnosis to reconstruct the internal ground-truth full-dose scan corresponding to an input CBCT image and thereby improve diagnostic performance. The external validation results for the unified diagnosis of sinus fungal ball, chronic rhinosinusitis, and normal cases show that the proposed method improves the micro-average AUC, macro-average AUC, and accuracy by 7.4, 5.6, and 9.6 percentage points (from 86.2, 87.0, and 73.4% to 93.6, 92.6, and 83.0%), respectively, compared with a baseline, while improving human diagnosis accuracy by 11.3 percentage points (from 71.7 to 83.0%), demonstrating technical differentiation and clinical effectiveness. This pioneering study on AI-based diagnosis using CBCT indicates that denoising can improve diagnostic performance and reader interpretability in images of the sinonasal area, thereby providing a new approach and direction for radiographic image reconstruction in the development of AI-based diagnostic solutions.
Large language models (LLMs) have been shown to be able to perform new tasks based on a few demonstrations or natural language instructions. While these capabilities have led to widespread adoption, most LLMs are developed by resource-rich organizations and are frequently kept from the public. As a step towards democratizing this powerful technology, we present BLOOM, a 176B-parameter open-access language model designed and built thanks to a collaboration of hundreds of researchers. BLOOM is a decoder-only Transformer language model that was trained on the ROOTS corpus, a dataset comprising hundreds of sources in 46 natural and 13 programming languages (59 in total). We find that BLOOM achieves competitive performance on a wide variety of benchmarks, with stronger results after undergoing multitask prompted finetuning. To facilitate future research and applications using LLMs, we publicly release our models and code under the Responsible AI License.
Authorship attribution is the task of identifying the author of a given text. Most existing approaches use manually designed features that capture a dataset's content and style. However, such dataset-dependent approaches yield inconsistent performance. We therefore propose to fine-tune pre-trained language representations using a combination of contrastive learning and supervised learning (Contra-X). We show that Contra-X advances the state of the art on multiple human and machine authorship attribution benchmarks, with improvements of up to 6.8%. We also show that Contra-X consistently outperforms cross-entropy fine-tuning across different data regimes. Crucially, we present qualitative and quantitative analyses of these improvements. Our learned representations form highly separable clusters for different authors. However, we find that contrastive learning improves overall accuracy at the cost of sacrificing performance for some authors. Resolving this tension will be an important direction for future work. To the best of our knowledge, we are the first to analyze the effect of combining contrastive learning with cross-entropy fine-tuning for authorship attribution.
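The combination of a supervised contrastive objective with cross-entropy fine-tuning can be sketched as follows; this is a minimal NumPy illustration in which the loss forms and the mixing weight `lam` are illustrative assumptions, not the paper's exact formulation:

```python
import numpy as np

def cross_entropy(logits, labels):
    # standard softmax cross-entropy over a batch
    z = logits - logits.max(axis=1, keepdims=True)
    log_probs = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(labels)), labels].mean()

def supervised_contrastive(features, labels, temperature=0.1):
    # supervised contrastive loss: pull embeddings of the same author
    # together, push embeddings of different authors apart
    f = features / np.linalg.norm(features, axis=1, keepdims=True)
    sim = f @ f.T / temperature
    n = len(labels)
    total = 0.0
    for i in range(n):
        others = [j for j in range(n) if j != i]
        positives = [j for j in others if labels[j] == labels[i]]
        if not positives:
            continue
        log_denom = np.log(np.exp(sim[i, others]).sum())
        total += -np.mean([sim[i, j] - log_denom for j in positives])
    return total / n

def contra_x_loss(features, logits, labels, lam=0.5):
    # weighted sum of the two objectives; lam is a hypothetical mixing
    # weight, not a value reported in the paper
    return lam * supervised_contrastive(features, labels) + \
           (1.0 - lam) * cross_entropy(logits, labels)
```

In training, `features` would be the encoder's pooled representations and `logits` the classification-head outputs for the same batch.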
In multi-person 2D pose estimation, bottom-up methods simultaneously predict poses for all persons and, unlike top-down methods, do not rely on human detection. However, the accuracy of SOTA bottom-up methods is still inferior to that of existing top-down methods. This is because the predicted human poses are regressed based on inconsistent human bounding-box centers and lack human-scale normalization, leading to inaccurate predicted poses and missed small-scale persons. To push the envelope of bottom-up pose estimation, we first propose multi-scale training to enhance the network's ability to handle scale variation with single-scale testing, especially for small-scale persons. Second, we introduce dual anatomical centers (i.e., head and body), from which human poses can be predicted more accurately and reliably, especially for small-scale persons. Moreover, existing bottom-up methods adopt multi-scale testing to boost pose-estimation accuracy at the price of multiple extra forward passes, which weakens the efficiency of bottom-up methods, their core strength compared with top-down methods. By contrast, our multi-scale training enables the model to predict high-quality poses in a single forward pass (i.e., single-scale testing). Our method achieves a 38.4% improvement in bounding-box precision and a 39.1% improvement in bounding-box recall over the state of the art (SOTA) on the challenging small-scale person subset of COCO. For human pose AP evaluation, we achieve a new SOTA (71.0 AP) on the COCO test-dev set with single-scale testing. We also achieve the top performance (40.3 AP) on the OCHuman dataset in cross-dataset evaluation.
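The multi-scale training idea, randomly rescaling each training image so the network sees persons at many sizes, can be sketched as follows (the scale set and the nearest-neighbour resize are assumptions for illustration, not the paper's training recipe):

```python
import random
import numpy as np

def multiscale_batch(image, scales=(0.5, 1.0, 1.5), rng=random.Random(0)):
    # pick a random training scale per iteration so the network is
    # exposed to many person sizes; nearest-neighbour index mapping
    # keeps this sketch dependency-free
    s = rng.choice(scales)
    h, w = image.shape[:2]
    nh, nw = max(1, int(h * s)), max(1, int(w * s))
    rows = np.arange(nh) * h // nh
    cols = np.arange(nw) * w // nw
    return image[rows][:, cols]
```

At test time the model then runs once at a single scale, avoiding the extra forward passes of multi-scale testing.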
With the growing popularity of artificial intelligence and machine learning, a wide spectrum of attacks against deep learning models has been proposed in the literature. Both evasion attacks and poisoning attacks attempt to exploit adversarially altered samples to fool the victim model into misclassifying the adversarial sample. Although such attacks claim to be stealthy, i.e., invisible to human eyes, this claim is seldom evaluated. In this paper, we present the first large-scale study on the stealthiness of adversarial samples used in attacks against deep learning. We implemented 20 representative adversarial ML attacks on six popular benchmark datasets. We evaluated the stealthiness of the attack samples using two complementary approaches: (1) a numerical study adopting 24 metrics for image similarity or quality assessment; and (2) a user study of 3 sets of questionnaires that collected over 20,000 annotations from more than 1,000 responses. Our results show that most existing attacks introduce non-negligible perturbations that are not stealthy to human eyes. We further analyze the factors that contribute to attack stealthiness. We additionally examine the correlation between the numerical analysis and the user study, and demonstrate that although some image-quality metrics may provide useful guidance for attack design, a significant gap remains between assessed image quality and the visual stealthiness of attacks.
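Two of the simplest perturbation metrics of the kind used in such numerical studies, PSNR and the L-infinity norm, can be computed as follows (a generic sketch, not the paper's exact 24-metric suite):

```python
import numpy as np

def psnr(clean, adv, max_val=255.0):
    # peak signal-to-noise ratio in dB; a higher value means the
    # adversarial change is numerically smaller
    diff = clean.astype(np.float64) - adv.astype(np.float64)
    mse = np.mean(diff ** 2)
    if mse == 0:
        return float("inf")
    return 10.0 * np.log10(max_val ** 2 / mse)

def linf_norm(clean, adv):
    # maximum per-pixel change, the budget most attacks are bounded by
    return np.abs(clean.astype(np.float64) - adv.astype(np.float64)).max()
```

As the study shows, good scores on metrics like these do not guarantee that a perturbation is invisible to human observers.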
Nowadays, cameras equipped with AI systems can capture and analyze images to automatically detect people. However, an AI system can make mistakes when it receives deliberately designed patterns in the real world, i.e., physical adversarial examples. Prior works have shown that adversarial patches can be printed on clothes to evade DNN-based person detectors. However, these adversarial examples can suffer a catastrophic drop in attack success rate when the viewing angle (i.e., the camera's angle towards the object) changes. To perform a multi-angle attack, we propose Adversarial Texture (AdvTexture). AdvTexture can cover clothes of arbitrary shapes so that people wearing such clothes can evade person detectors from different viewing angles. We propose a generative method, named Toroidal-Cropping-based Expandable Generative Attack (TC-EGA), to craft AdvTexture with repetitive structures. We printed several pieces of cloth with AdvTexture and then made T-shirts, skirts, and dresses in the physical world. Experiments showed that these clothes can fool person detectors in the physical world.
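The repetitive-structure idea behind AdvTexture, covering an arbitrarily large cloth area by periodically tiling a base patch, can be sketched as follows (a simplified illustration; TC-EGA itself additionally optimizes the base patch adversarially):

```python
import numpy as np

def tile_texture(patch, out_h, out_w):
    # repeat a base patch periodically to cover a cloth region of any
    # size, then crop to the requested dimensions
    reps_h = -(-out_h // patch.shape[0])  # ceiling division
    reps_w = -(-out_w // patch.shape[1])
    big = np.tile(patch, (reps_h, reps_w))
    return big[:out_h, :out_w]
```

Because the texture is periodic, any crop of the cloth contains the same adversarial structure, which is what makes the attack viable from many viewing angles.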
Despite recent advances in modern machine learning algorithms, the opacity of their inner mechanisms remains an obstacle to adoption. To instill confidence and trust in artificial intelligence systems, explainable artificial intelligence has emerged as a response for improving the explainability of modern machine learning algorithms. Inductive logic programming (ILP), a subfield of symbolic artificial intelligence, plays a promising role in generating interpretable explanations because of its intuitive logic-driven framework. ILP effectively leverages deductive reasoning to generate explainable first-order clausal theories from examples and background knowledge. However, several challenges in the development of ILP-inspired methods must be addressed for their successful application in practice. For example, existing ILP systems often have a vast solution space, and the induced solutions are very sensitive to noise and disturbances. This survey summarizes recent advances in ILP and discusses statistical relational learning and neuro-symbolic algorithms, which offer synergistic views to ILP. Following a rigorous review of recent advances, we delineate the observed challenges and highlight potential avenues of further ILP-motivated research toward developing self-explanatory artificial intelligence systems.
The continual learning (CL) ability of humans is closely related to the stability-plasticity dilemma, which describes how humans achieve ongoing learning capacity while preserving learned information. The notion of CL has been present in artificial intelligence (AI) since its inception. This paper presents a comprehensive review of CL. Different from previous reviews, which mainly focus on the catastrophic forgetting phenomenon in CL, this paper surveys CL from a macroscopic perspective based on the stability-plasticity mechanism. Analogous to its biological counterpart, an "intelligent" AI agent should: i) remember previously learned information (information retrospection); ii) continually infer new information (information prospection); and iii) transfer useful information (information transfer), to achieve high-level CL. Following this taxonomy, we review evaluation metrics, algorithms, applications, and some open questions. Our main contributions are: i) re-examining CL from the level of artificial general intelligence; ii) providing a detailed and extensive overview of the CL topic; and iii) presenting some novel ideas on the potential development of CL.
Knowledge graphs (KGs) capture knowledge in the form of head-relation-tail triples and are a vital component of many AI systems. There are two important reasoning tasks on KGs: (1) single-hop knowledge graph completion, which involves predicting individual links in the KG; and (2) multi-hop reasoning, where the goal is to predict which KG entities satisfy a given logical query. Embedding-based methods solve both tasks by first computing an embedding for each entity and relation, then using them to form predictions. However, existing scalable KG embedding frameworks only support single-hop knowledge graph completion and cannot be applied to the more challenging multi-hop reasoning task. Here we present Scalable Multi-hOp REasoning (SMORE), the first general framework for both single-hop and multi-hop reasoning on KGs. Using a single machine, SMORE can perform multi-hop reasoning on the Freebase KG (86M entities, 338M edges), which is 1,500x larger than previously considered KGs. The key to SMORE's runtime performance is a novel bidirectional rejection sampling that achieves a square-root reduction in the complexity of online training-data generation. SMORE also exploits asynchronous scheduling, overlapping CPU-based data sampling, GPU-based embedding computation, and frequent CPU-GPU IO. SMORE increases throughput (i.e., training speed) over prior multi-hop KG frameworks by 2.2x with minimal GPU memory requirements (2GB for training 400-dimensional embeddings on the 86M-node Freebase) and achieves near-linear speed-up with the number of GPUs. Moreover, on the simpler single-hop knowledge graph completion task, SMORE achieves comparable or better runtime performance than state-of-the-art frameworks in both single-GPU and multi-GPU settings.
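The negative-sampling step at the heart of online training-data generation can be illustrated with plain rejection sampling; this is a simplified sketch of the idea, not SMORE's bidirectional algorithm:

```python
import random

def sample_negatives(answer_set, num_entities, k, rng=random.Random(0)):
    # rejection-style negative sampling: draw uniform entity ids and
    # reject any that actually answer the query, so the surviving
    # samples are guaranteed non-answers
    negatives = []
    while len(negatives) < k:
        e = rng.randrange(num_entities)
        if e not in answer_set:
            negatives.append(e)
    return negatives
```

For multi-hop queries, verifying whether a candidate answers the query is itself expensive; SMORE's bidirectional variant reduces that cost, which is where the reported square-root complexity reduction comes from.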
The growing interest in intelligent services and privacy protection for mobile devices has given rise to the widespread application of federated learning in Multi-access Edge Computing (MEC). Diverse user behaviors call for personalized services with heterogeneous Machine Learning (ML) models on different devices. Federated Multi-task Learning (FMTL) is proposed to train related but personalized ML models for different devices, whereas previous works suffer from excessive communication overhead during training and neglect the model heterogeneity among devices in MEC. Introducing knowledge distillation into FMTL can simultaneously enable efficient communication and model heterogeneity among clients, whereas existing methods rely on a public dataset, which is impractical in reality. To tackle this dilemma, Federated MultI-task Distillation for Multi-access Edge CompuTing (FedICT) is proposed. FedICT keeps local and global knowledge apart during the bi-directional distillation processes between clients and the server, aiming to enable multi-task clients while alleviating the client drift caused by the divergent optimization directions of client-side local models. Specifically, FedICT includes Federated Prior Knowledge Distillation (FPKD) and Local Knowledge Adjustment (LKA). FPKD is proposed to reinforce the clients' fitting of local data by introducing prior knowledge of local data distributions. Moreover, LKA is proposed to correct the distillation loss of the server, making the transferred local knowledge better match the generalized representation. Experiments on three datasets show that FedICT significantly outperforms all compared benchmarks in various data-heterogeneity and model-architecture settings, achieving improved accuracy with less than 1.2% of the training communication overhead of FedAvg and no more than 75% of the training communication rounds of FedGKT.
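The knowledge exchanged in bi-directional distillation between clients and the server is typically a temperature-softened KL objective; the following is a minimal sketch in which the temperature and exact loss form are generic knowledge-distillation assumptions, not FedICT's published equations:

```python
import numpy as np

def softmax(z, T=1.0):
    # temperature-scaled softmax; higher T yields softer distributions
    z = z / T
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, T=2.0):
    # KL(teacher || student) on softened outputs, scaled by T^2 as in
    # standard knowledge distillation
    p = softmax(teacher_logits, T)
    q = softmax(student_logits, T)
    return (T ** 2) * np.sum(p * (np.log(p) - np.log(q)), axis=-1).mean()
```

In a distillation-based FMTL setting, only such logits (rather than full model weights) cross the network, which is the source of the communication savings relative to FedAvg-style weight exchange.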